GLR* - An Efficient Noise-skipping Parsing Algorithm for Context Free Grammars

Authors

  • Alon Lavie
  • Masaru Tomita
Abstract

This paper describes GLR*, a parser that can parse any input sentence by ignoring unrecognizable parts of the sentence. In case the standard parsing procedure fails to parse an input sentence, the parser nondeterministically skips some word(s) in the sentence, and returns the parse with fewest skipped words. Therefore, the parser will return some parse(s) with any input sentence, unless no part of the sentence can be recognized at all. The problem can be defined in the following way: Given a context-free grammar G and a sentence S, find and parse S', the largest subset of words of S, such that S' ∈ L(G). The algorithm described in this paper is a modification of the Generalized LR (Tomita) parsing algorithm [Tomita, 1986]. The parser accommodates the skipping of words by allowing shift operations to be performed from inactive state nodes of the Graph Structured Stack. A heuristic similar to beam search makes the algorithm computationally tractable. There have been several other approaches to the problem of robust parsing, most of which are special purpose algorithms [Carbonell and Hayes, 1984], [Ward, 1991] and others. Because our approach is a modification to a standard context-free parsing algorithm, all the techniques and grammars developed for the standard parser can be applied as they are. Also, in case the input sentence is by itself grammatical, our parser behaves exactly as the standard GLR parser. The modified parser, GLR*, has been implemented and integrated with the latest version of the Generalized LR Parser/Compiler [Tomita et al., 1988], [Tomita, 1990]. We discuss an application of the GLR* parser to spontaneous speech understanding and present some preliminary tests on the utility of the GLR* parser in such settings.
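
The problem statement in the abstract (find the largest set of words of S that still forms a sentence of L(G)) can be illustrated with a small brute-force sketch. This is not the GLR* algorithm: it does not use an LR table, a Graph Structured Stack, or a beam heuristic, and it takes exponential time in the worst case. It only shows what "the parse with fewest skipped words" means. The toy grammar, the lexicon, and the function names `recognizes` and `largest_parsable_subsequence` are assumptions made up for this example, and a CYK recognizer stands in for the parser.

```python
from itertools import combinations

# Hypothetical toy grammar in Chomsky normal form (not taken from the paper).
# Binary rules map a pair (B, C) to the nonterminals A with a rule A -> B C.
BINARY = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
# Lexical rules map a word to its possible preterminals.
LEXICON = {
    "the": {"Det"},
    "a": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
}


def recognizes(words, start="S"):
    """CYK recognizer: True if `words` can be derived from `start`."""
    n = len(words)
    if n == 0:
        return False
    # table[i][length - 1] holds the nonterminals deriving words[i:i + length].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][0] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for split in range(1, span):
                for left in table[i][split - 1]:
                    for right in table[i + split][span - split - 1]:
                        table[i][span - 1] |= BINARY.get((left, right), set())
    return start in table[0][n - 1]


def largest_parsable_subsequence(words):
    """Keep as many words as possible, in order, so that the rest parses.

    Returns (kept_words, number_skipped), or (None, len(words)) if no
    non-empty subsequence is recognized.
    """
    for keep_count in range(len(words), 0, -1):
        for keep in combinations(range(len(words)), keep_count):
            kept = [words[i] for i in keep]
            if recognizes(kept):
                return kept, len(words) - keep_count
    return None, len(words)


if __name__ == "__main__":
    noisy = "the dog uh sees um a cat".split()
    print(largest_parsable_subsequence(noisy))
```

On a grammatical input the search succeeds at the first attempt with zero skipped words, which mirrors the abstract's remark that GLR* behaves like the standard parser in that case. The enumeration of all subsequences is exactly the cost that the paper's modified shift operations and beam-like heuristic are meant to avoid.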

Similar resources

Construction of Efficient Generalized LR Parsers

We show how LR parsers for the analysis of arbitrary context-free grammars can be derived from Earley's classical parsing algorithm. The result is a Generalized LR parsing algorithm working at complexity O(n^3) in the worst case, which is achieved by the use of dynamic programming to represent the non-deterministic evolution of the stack instead of graph-structured stack representations, as has...

String Shuffling over a Gap between Parsing and Plan Recognition

We propose a new probabilistic plan recognition algorithm YR based on an extension of Tomita’s Generalized LR (GLR) parser for grammars enriched with the shuffle operator. YR significantly outperforms previous approaches based on top-down parsers, shows more consistent run times among similar libraries, and degrades more gracefully as plan library complexity increases. YR also lifts the restrict...

In Recent Advances in Parsing Technology

This chapter describes GLR*, a parser that can parse any input sentence by ignoring unrecognizable parts of the sentence. Using an efficient algorithm, the parser is capable of finding and parsing a maximal subset of the original input that is parsable, and therefore returns the parse with fewest skipped words. The parser returns some parse(s) for any input sentence, unless no part of the sentence c...

Elkhound: A Fast, Practical GLR Parser Generator

The Generalized LR (GLR) parsing algorithm is attractive for use in parsing programming languages because it is asymptotically efficient for typical grammars, and can parse with any context-free grammar, including ambiguous grammars. However, adoption of GLR has been slowed by high constant-factor overheads and the lack of a general, user-defined action interface. In this paper we present algor...

A New Method for Dependent Parsing

Dependent grammars extend context-free grammars by allowing semantic values to be bound to variables and used to constrain parsing. Dependent grammars can cleanly specify common features that cannot be handled by context-free grammars, such as length fields in data formats and significant indentation in programming languages. Few parser generators support dependent parsing, however. To address ...

Publication date: 1993